PrizmDoc
Implementing PrizmDoc Server Caching Strategies

Why Does PrizmDoc Server Cache Files?

The power behind Prizm Services’ ability to deliver viewable web content quickly and efficiently lies in its cache management. Viewing a multipage document requires that each document page be converted into a web-compatible format such as JPEG, PNG, or ideally SVG (which gives the highest fidelity when scaled). The conversion process is not instantaneous, so there is some delay before a page can be viewed. Because PrizmDoc Server assumes a document will be viewed by more than one person over multiple sessions, it converts all the pages into web-viewable intermediate objects and stores them in its cache folders.

The conversion process begins when the viewing session is started or with the first request to view a document page by a given viewing session. Typically, the viewable page data that is generated will then be made available to any subsequent request for the same pages, reducing the time to view to only the time it takes to download the page data to the browser. To summarize, the cached files help deliver viewing performance because the viewing objects are pre-generated and stored in the cache folders.

The Cost of the PrizmDoc Server Cache

Cached files must be stored on some media device for some period of time. Files created for viewing can take up a considerable amount of space, so the growth of the cache needs to be controlled. PrizmDoc Server provides options for controlling both where cached files are stored and how long they are kept. The cache consists of folders with different purposes, and these folders can be relocated to different devices to spread the storage burden if necessary.

Optimizing Cache Performance

The majority of the PrizmDoc Server cache is made up of pre-generated document pages that are readily available on demand. Caching these files already helps performance when the same document is viewed repeatedly. There are three configurable cache folder locations, and placing certain ones on more responsive media can result in a better viewing experience with less burden on the server hosting the PrizmDoc Server service. Solid state drives (SSD) or Shared Memory (Linux only) minimize input/output (I/O) latency and access times for cached files, but these storage devices typically have much less storage capacity.

Cache Strategies and Tradeoffs

Several scenarios are described below with proposed cache configuration solutions. You should be familiar with the central configuration file settings as outlined in Central Configuration Options. Along with the central configuration file, there is a property in the JSON object that the application posts when requesting a new viewing session from PrizmDoc Server (refer to the PrizmDoc Server sample and the How To Adjust Caching Parameters for PrizmDoc Server topic).

The default settings in the central configuration file cause viewing sessions to time out after 20 minutes and cached files to expire after one day. Also by default, the PrizmDoc Server cache folders are all created within the same parent directory on the root drive. These default settings give a reader 20 minutes to read a document once the viewing session is started. After that time period, a new viewing session must be created for them to continue reading the document, either by refreshing their browser or by another mechanism you implement in your application.

The next time the same document is viewed, PrizmDoc Server will simply deliver the viewing objects that were created in the first viewing session to the same reader, or to any other reader viewing the same document, for about 24 hours after the first viewing session was created. When a reader (same or new) requests to read the document a day later, the cache process starts over because PrizmDoc will have already deleted the cached pages and will have to re-generate all the viewable content of the document again.

Scenario 1:

Viewing response appears slow even with caching enabled because many readers are viewing the document.

Solution:

Set the cache.directory setting in the central configuration file to a location on a faster SSD device or, in Linux environments, to a folder on the Shared Memory device (i.e. /dev/shm).

Example for Shared Memory Device
cache.directory: /dev/shm/Accusoft/Prizm/

The above setting in central configuration places the cache directories in Shared Memory on a Linux OS environment. Because Shared Memory is faster than standard disk drives, PrizmDoc Server will typically respond more quickly, with less overall stress on the server when delivering viewing content.
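Because Shared Memory and SSD devices usually have limited capacity, it can help to check free space before pointing the cache at one. The following is a minimal Python sketch, not part of PrizmDoc; the function name and the idea of a fixed required size are assumptions made for illustration.

```python
import shutil


def has_room_for_cache(cache_dir: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding cache_dir has at least
    required_bytes of free space.

    cache_dir must already exist. required_bytes is an estimate you
    supply; this sketch does not try to predict how large the cache
    will actually grow.
    """
    usage = shutil.disk_usage(cache_dir)  # named tuple: total, used, free (in bytes)
    return usage.free >= required_bytes
```

For example, calling has_room_for_cache("/dev/shm", 2 * 1024**3) on a Linux host reports whether roughly 2 GiB is currently free on the Shared Memory device.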

The following configuration properties have been deprecated and will be removed in a future release. Alter these properties only if not using the central configuration file.

Set the GroupStateFolder setting in the pcc.config file to a location on a faster SSD device or, in Linux environments, to a folder on the Shared Memory device (i.e. /dev/shm). The other cache folders noted in pcc.config, DocumentPath and TempcachePath, could also benefit from being placed on faster storage devices.

Example for Shared Memory Device
<GroupStateFolder>/dev/shm/Accusoft/Prizm/GroupState</GroupStateFolder>
<DocumentPath>/dev/shm/Accusoft/Prizm/DocumentCache</DocumentPath>
<TempcachePath>/dev/shm/Accusoft/Prizm/Cache</TempcachePath>

Scenario 2:

Viewing Clients are getting errors and the storage device used for the PrizmDoc Server cache is showing errors because the device is full.

Solution:

Depending on the available storage capacity of the selected device, the cache expiration period specified by viewing.cacheLifetime in central configuration may need to be shortened to accommodate the cache load. Please note that the time period for viewing.cacheLifetime should not be shorter than the viewing.sessionLifetime time period. Otherwise, viewing.sessionLifetime takes precedence and the cache expiration period is forced to the same value. The viewing.sessionLifetime time period can also be shortened, but at the cost of reducing the amount of time a user has to read a document in a single viewing session.
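The precedence rule described above can be sketched in Python as follows. This is only an illustration of the rule as stated in this topic, not PrizmDoc code; the helper names are hypothetical, and the parser assumes only the m, h, and d suffixes that appear in the examples here (the product may accept other formats).

```python
import re
from datetime import timedelta

# Suffixes seen in this topic's examples; treating these as the only
# valid units is an assumption of this sketch.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '15m', '1h', or '1d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def effective_cache_lifetime(session_lifetime: str, cache_lifetime: str) -> timedelta:
    """Apply the rule that the cache lifetime is never shorter than
    the session lifetime: the larger of the two values wins."""
    session = parse_duration(session_lifetime)
    cache = parse_duration(cache_lifetime)
    return max(session, cache)
```

With a session lifetime of 1h and a cache lifetime of 20m, the function returns one hour, mirroring the case where the session lifetime takes precedence.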

Rather than changing the viewing session timeout period, try increasing the capacity of the (fast) storage device.

Example for Quicker Cache Cleanup
viewing.sessionLifetime: 15m
viewing.cacheLifetime: 20m

The above settings set the viewing session timeout to 15 minutes and the life expectancy of any cached file to 20 minutes. After approximately 35 to 45 minutes, the cached files for a given document will be deleted. The exact time of cleanup can vary based on the scheduled nature of the cleanup processes and current load on the server.
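The figures in this topic (about 35 minutes at the earliest for 15m plus 20m, and 25 or more hours for 1h plus 1d) suggest that the earliest cleanup time is roughly the session lifetime plus the cache lifetime. The sketch below encodes that reading as an assumption; it gives only a lower bound and does not model the cleanup schedule or server load.

```python
from datetime import timedelta


def earliest_cleanup(session_lifetime: timedelta, cache_lifetime: timedelta) -> timedelta:
    """Estimate the earliest time after session creation at which cached
    files may be deleted, assuming the cache lifetime starts counting
    once the viewing session has timed out."""
    return session_lifetime + cache_lifetime
```

Actual cleanup happens some time after this lower bound, depending on when the scheduled cleanup process next runs.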

The following configuration properties have been deprecated and will be removed in a future release. Alter these properties only if not using the central configuration file.

The cache expiration period may be specified by CacheExpirationPeriod in pcc.config. Please note that the time period for CacheExpirationPeriod should not be any shorter than the ViewingSessionTimeout time period.

If it is not practical to change the size of the storage device, try moving the directory specified by TempcachePath to a different storage device and, if that isn’t enough, do the same for DocumentPath. Splitting cache folders across dedicated storage devices can benefit performance by reducing disk latency for Hard Disk Drives (HDD) compared to having one HDD serve all the viewing sessions.
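After relocating a cache folder, you may want to confirm that it really sits on a different device than the others. The following Python sketch compares the device identifiers reported by the operating system; the function name is hypothetical, and both paths must already exist.

```python
import os


def on_same_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths live on the same filesystem device,
    as reported by the st_dev field of os.stat()."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

If on_same_device returns True for the TempcachePath and DocumentPath directories, the two folders still share one device and the load has not been split.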

Example for Quicker Cache Cleanup

<CacheExpirationPeriod>20m</CacheExpirationPeriod>
<ViewingSessionTimeout>15m</ViewingSessionTimeout>

Scenario 3:

Your application views a lot of large documents and users cannot finish reading them before they get a viewing session timeout error.

Solution:

The default setting in the central configuration file for viewing.sessionLifetime is 20 minutes. It can be increased, but PrizmDoc Server will then have more resources to track at any given moment, which could affect performance and host server capacity.

Example of Longer Viewing Session Duration
viewing.sessionLifetime: 1h
viewing.cacheLifetime: 1d

The above settings give users an hour to read a given document. Cache resources for the document will be removed 25 or more hours later. As above, the timing of cache cleanup varies with the scheduled nature of the cleanup processes and the current load on the server.

The following configuration properties have been deprecated and will be removed in a future release. Alter these properties only if not using the central configuration file.

The default setting in the pcc.config file for ViewingSessionTimeout is 20 minutes. It can be increased, but PrizmDoc Server will then have more resources to track at any given moment, which could affect performance and host server capacity.

Example of Longer Viewing Session Duration
<ViewingSessionTimeout>1h</ViewingSessionTimeout>
<CacheExpirationPeriod>1d</CacheExpirationPeriod>

Scenario 4:

The documents served are fairly random and not typically shared with others.

- Or -

The image is watermarked uniquely for each Viewing Client and should not be shared.

Solution:

In this scenario, the cache resources are unlikely to be needed by anyone except the initial user. A property in the JSON object that the application posts when requesting a new viewing session from PrizmDoc Server can be used to disable caching on a per-viewing-session basis. The property, serverCaching, should be set explicitly to the string value none when the application issues the POST request to get a new viewing session ID. Each document uploaded to PrizmDoc Server will then be converted without PrizmDoc Server looking for an existing copy of the document. After the viewing session times out, the cached items for the document will be removed on a predetermined schedule, which should be fairly quick because no other viewing sessions are using the data. For example:

Example
POST /ViewingSession
{
...
    "serverCaching": "none",
...
}

After the viewing session times out, the cache items should be removed fairly soon.
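As an illustration, an application written in Python might build the request like this. It is a minimal sketch under stated assumptions: the function name and base URL are hypothetical, and session_properties stands in for whatever other properties your application already posts (those are elided above and not filled in here). Only the serverCaching property and the POST to /ViewingSession come from this topic.

```python
import json
import urllib.request


def build_uncached_session_request(base_url: str, session_properties: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /ViewingSession request with
    server caching disabled for that one viewing session.

    base_url is a placeholder for wherever your application sends its
    viewing session requests.
    """
    body = dict(session_properties)  # copy so the caller's dict is not modified
    body["serverCaching"] = "none"   # the string value described in this topic
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ViewingSession",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can then be passed to urllib.request.urlopen to send it. Building the request separately keeps the caching decision easy to inspect and test.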

Summary

The PrizmDoc Server cache provides a mechanism to deliver document content in a timely manner. However, each application is different and may tax server resources differently or have more demanding requirements. Balancing resource constraints against user experience can be a difficult task that may require compromises. Faster hardware, specifically high-speed storage devices, coupled with an understanding of the options for adjusting how the PrizmDoc Server cache behaves, should allow you to reach a desired level of performance while maintaining a good user experience.

©2016. Accusoft Corporation. All Rights Reserved.
